Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels

Neural Information Processing Systems

Specifically, we prove that, for a simple data distribution with sparse signal amidst high-variance noise, a simple convolutional neural network trained using stochastic gradient descent simultaneously learns to threshold out the noise and find the signal.
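The setting described in this abstract can be illustrated on a toy problem. The sketch below is our own construction, not the paper's exact model: the patch count, noise scale, initialization, and the one-filter ReLU network whose bias acts as a learnable threshold are all illustrative assumptions. One patch of each input carries a planted signal ±w*, the rest are Gaussian noise, and the network is trained with plain SGD on logistic loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data model (illustrative assumption, not the paper's exact distribution):
# each example is P patches of dimension d; one random patch holds the signal
# y * w_star, and the remaining patches are Gaussian noise.
P, d = 8, 16
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
noise_std = 0.3  # per-coordinate noise scale (assumption)

def sample(n):
    X = noise_std * rng.standard_normal((n, P, d))
    y = rng.choice([-1.0, 1.0], size=n)
    idx = rng.integers(0, P, size=n)
    X[np.arange(n), idx] = y[:, None] * w_star  # plant the sparse signal
    return X, y

# One-filter "convolutional" net: f(X) = sum_p relu(<w, x_p> - b).
# The bias b plays the role of a threshold that can suppress noise patches.
def forward(X, w, b):
    return np.maximum(X @ w - b, 0.0).sum(axis=1)

w = 0.3 * rng.standard_normal(d) / np.sqrt(d)
b = 0.0
lr = 0.1
for _ in range(500):
    X, y = sample(64)
    act = X @ w - b                          # (n, P) pre-activations
    mask = (act > 0).astype(float)           # ReLU active-set
    pred = np.maximum(act, 0.0).sum(axis=1)
    g = -y / (1.0 + np.exp(y * pred))        # d(logistic loss)/d(pred)
    grad_w = np.einsum('n,np,npd->d', g, mask, X) / len(y)
    grad_b = -(g[:, None] * mask).sum() / len(y)
    w -= lr * grad_w
    b -= lr * grad_b

cos = w @ w_star / (np.linalg.norm(w) * np.linalg.norm(w_star) + 1e-12)
print(f"alignment with planted signal: {cos:.2f}, threshold b = {b:.2f}")
```

This is only a sketch of the phenomenon the paper analyzes formally; the theoretical result concerns the training dynamics, not this particular toy instance.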







Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds

Cauzinille, Jules, Miron, Marius, Pietquin, Olivier, Hagiwara, Masato, Marxer, Ricard, Rey, Arnaud, Favre, Benoit

arXiv.org Artificial Intelligence

Self-supervised speech models have demonstrated impressive performance in speech processing, but their effectiveness on non-speech data remains underexplored. We study the transfer learning capabilities of such models on bioacoustic detection and classification tasks. We show that models such as HuBERT, WavLM, and XEUS can generate rich latent representations of animal sounds across taxa. We analyze the models' properties with linear probing on time-averaged representations. We then extend the approach to account for the effect of time-wise information with other downstream architectures. Finally, we study the implications of frequency range and noise on performance. Notably, our results are competitive with fine-tuned bioacoustic pre-trained models and show the impact of noise-robust pre-training setups. These findings highlight the potential of speech-based self-supervised learning as an efficient framework for advancing bioacoustic research.
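The linear-probing protocol mentioned in the abstract is straightforward to sketch. The code below is a minimal illustration with synthetic features standing in for HuBERT/WavLM/XEUS frame-level outputs; the dimensions, the two-class structure, and the use of a closed-form one-hot ridge regression as the linear probe are our assumptions, not the paper's exact pipeline. Each clip's (T, D) frame embeddings are mean-pooled over time, then a linear classifier is fit on the pooled vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for frame-level model outputs: n clips, each a (T, D)
# array of "embeddings" whose class identity is encoded as a mean offset.
n, T, D, n_classes = 200, 50, 32, 2
labels = rng.integers(0, n_classes, size=n)
class_means = np.zeros((n_classes, D))
class_means[0, 0], class_means[1, 0] = -3.0, 3.0   # well-separated toy classes
clips = [class_means[c] + 0.5 * rng.standard_normal((T, D)) for c in labels]

# Step 1: time-average each clip's frame embeddings -> one vector per clip.
pooled = np.stack([clip.mean(axis=0) for clip in clips])      # (n, D)

# Step 2: linear probe. Here a one-hot ridge regression, a common
# closed-form stand-in for a logistic-regression probe.
train, test = np.arange(0, 150), np.arange(150, n)
Y = np.eye(n_classes)[labels[train]]                          # one-hot targets
X = pooled[train]
W = np.linalg.solve(X.T @ X + 1e-3 * np.eye(D), X.T @ Y)      # (D, n_classes)

pred = pooled[test] @ W
acc = (pred.argmax(axis=1) == labels[test]).mean()
print(f"linear-probe accuracy on held-out clips: {acc:.2f}")
```

Because the probe is linear and frozen features are reused across tasks, this protocol measures how much task-relevant structure the pretrained representations already contain, which is the comparison the paper draws against fine-tuned bioacoustic models.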


RECTor: Robust and Efficient Correlation Attack on Tor

Wu, Binghui, Divakaran, Dinil Mon, Csikor, Levente, Gurusamy, Mohan

arXiv.org Artificial Intelligence

Tor is a widely used anonymity network that conceals user identities by routing traffic through encrypted relays, yet it remains vulnerable to traffic correlation attacks that deanonymize users by matching patterns in ingress and egress traffic. However, existing correlation methods suffer from two major limitations: limited robustness to noise and partial observations, and poor scalability due to computationally expensive pairwise matching. To address these challenges, we propose RECTor, a machine learning-based framework for traffic correlation under realistic conditions. RECTor employs attention-based Multiple Instance Learning (MIL) and GRU-based temporal encoding to extract robust flow representations, even when traffic data is incomplete or obfuscated. These embeddings are mapped into a shared space via a Siamese network and efficiently matched using approximate nearest neighbor (aNN) search. Empirical evaluations show that RECTor outperforms state-of-the-art baselines such as DeepCorr, DeepCOFFEA, and FlowTracker, achieving up to 60% higher true positive rates under high-noise conditions and reducing training and inference time by over 50%. Moreover, RECTor demonstrates strong scalability: inference cost grows near-linearly as the number of flows increases. These findings reveal critical vulnerabilities in Tor's anonymity model and highlight the need for advanced model-aware defenses.
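The matching pipeline the abstract describes, a shared encoder mapping ingress and egress flows into one embedding space followed by nearest-neighbor search instead of exhaustive pairwise scoring, can be sketched as follows. This is an illustration under stated assumptions: a fixed random linear projection stands in for the trained Siamese/GRU encoder, brute-force cosine nearest neighbor stands in for the aNN index, and the flow features are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in: each ingress flow has a matching egress flow that is a
# noisy copy of it (noise models obfuscation / partial observation).
n_flows, feat_dim, emb_dim = 200, 64, 32
ingress = rng.standard_normal((n_flows, feat_dim))
egress = ingress + 0.05 * rng.standard_normal((n_flows, feat_dim))

# Shared encoder applied to BOTH sides (the Siamese property). Here just a
# fixed random linear projection for illustration.
enc = rng.standard_normal((feat_dim, emb_dim)) / np.sqrt(feat_dim)

def embed(x):
    z = x @ enc
    return z / np.linalg.norm(z, axis=1, keepdims=True)  # unit norm for cosine

zi, ze = embed(ingress), embed(egress)

# Brute-force cosine nearest neighbor; an aNN index (e.g. HNSW) would
# replace this step to avoid the quadratic pairwise comparison.
matches = (zi @ ze.T).argmax(axis=1)
acc = (matches == np.arange(n_flows)).mean()
print(f"fraction of flows correctly re-matched: {acc:.2f}")
```

The design point this sketch highlights is that once both traffic directions live in a shared metric space, correlation becomes a nearest-neighbor query per flow rather than an all-pairs comparison, which is where the near-linear inference scaling claimed in the abstract comes from.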